Post-Masking Optimization of the Tradeoff between Information Loss and Disclosure Risk in Masked Microdata Sets

Authors

  • Francesc Sebé
  • Josep Domingo-Ferrer
  • Josep Maria Mateo-Sanz
  • Vicenç Torra
Abstract

Previous work by these authors has been directed to measuring the performance of microdata masking methods in terms of information loss and disclosure risk. Based on the proposed metrics, we show here how to improve the performance of any particular masking method. In particular, post-masking optimization is discussed for preserving as much as possible the moments of first and second order (and thus multivariate statistics) without increasing the disclosure risk. The technique proposed can also be used for synthetic microdata generation and can be extended to preservation of all moments up to m-th order, for any m.
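
To make the idea of preserving first- and second-order moments concrete, here is a minimal sketch in Python with NumPy. It is not the authors' optimization procedure: it swaps in a simple closed-form linear adjustment (Cholesky-based recoloring) that forces the masked data to have the same mean vector and covariance matrix as the original. The function name `match_first_second_moments` and the array names are hypothetical, and the sketch assumes both covariance matrices are positive definite (more records than variables, no constant columns). It does not evaluate disclosure risk, which would have to be checked separately.

```python
import numpy as np


def match_first_second_moments(original, masked):
    """Linearly adjust `masked` so that its sample mean vector and sample
    covariance matrix equal those of `original`.

    Illustrative only: a closed-form adjustment, not the paper's method.
    Both inputs are (n_records, n_variables) arrays.
    """
    original = np.asarray(original, dtype=float)
    masked = np.asarray(masked, dtype=float)

    mu_o = original.mean(axis=0)
    mu_m = masked.mean(axis=0)

    # Sample covariance matrices (variables in columns).
    cov_o = np.cov(original, rowvar=False)
    cov_m = np.cov(masked, rowvar=False)

    # Cholesky factors: cov = L @ L.T (requires positive definiteness).
    L_o = np.linalg.cholesky(cov_o)
    L_m = np.linalg.cholesky(cov_m)

    # A satisfies A @ cov_m @ A.T == cov_o.
    A = L_o @ np.linalg.inv(L_m)

    # Center the masked data, recolor it, then shift to the original mean.
    return (masked - mu_m) @ A.T + mu_o


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.normal(size=(500, 3))
    # Hypothetical masking step: additive noise.
    masked = original + rng.normal(scale=0.5, size=original.shape)
    adjusted = match_first_second_moments(original, masked)
    print(np.allclose(adjusted.mean(axis=0), original.mean(axis=0)))
    print(np.allclose(np.cov(adjusted, rowvar=False),
                      np.cov(original, rowvar=False)))
```

Because the adjustment is one global linear map, it can move masked records closer to or further from the original ones, so any real use would need to re-measure disclosure risk after the adjustment.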

Related resources

An evolutionary approach to enhance data privacy

Dissemination of data with sensitive information about individuals has an implicit risk of unauthorized disclosure. Perturbative masking methods propose the distortion of the original data sets before publication, tackling a difficult tradeoff between data utility (low information loss) and protection against disclosure (low disclosure risk). In this paper we describe how information loss and d...

Outlier Protection in Continuous Microdata Masking

Masking methods protect data sets against disclosure by perturbing the original values before publication. Masking causes some information loss (masked data are not exactly the same as original data) and does not completely suppress the risk of disclosure for the individuals behind the data set. Information loss can be measured by observing the differences between original and masked data while...

Automatic Generation of Masked Microdata

Disclosure Control is the discipline concerned with the modification of data containing confidential information about individual entities, such as persons, households, businesses, etc. in order to prevent third parties working with these data from recognizing entities in the data and thereby disclosing information about these entities. In very broad terms, disclosure risk is the risk that a gi...

Preserving Edits When Perturbing Microdata for Statistical Disclosure Control (Natalie Shlomo, Ton de Waal)

To protect individuals in microdata from the risk of re-identification, a general perturbative method called PRAM (the Post-Randomization Method) is sometimes used for masking records. This method adds “noise” to categorical variables by changing values of categories for a small number of records according to a prescribed probability matrix and a stochastic process based on the outcome of a ran...

Microdata Protection

Governmental, public, and private organizations are more and more frequently required to make data available for external release in a selective and secure fashion. Most data are today released in the form of microdata, reporting information on individual respondents. The protection of microdata against improper disclosure is therefore an issue that has become increasingly important and will co...

Publication date: 2002